Working notebook

Liner shipping schedules and country network construction, analysis (centralities and community detection), and visualizations

Table of Contents

Process port-to-port dataframe and enrich with metadata information

Metadata: get country names from UNSD

for d in list(frame['year'].unique()):
    data = frame[frame['year'] == d]
    data.to_csv("all_data_{}.csv".format(str(d)))

Undirected countries network construction

Convert to undirected countries network with weighted edges using total deployed capacity on a given shipping route
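The conversion described above can be sketched as follows. This is a minimal illustration, not the notebook's actual cell: the column names country_from, country_to and capacity and the toy records are assumptions, and dropping legs within a single country is a choice made here for the example.

```python
import pandas as pd
import networkx as nx

# Hypothetical port-to-port records already mapped to countries.
# Column names and values are made up for illustration.
ports = pd.DataFrame({
    "country_from": ["CN", "SG", "CN", "NL", "SG", "CN"],
    "country_to":   ["SG", "CN", "NL", "CN", "NL", "CN"],
    "capacity":     [1000, 500, 300, 200, 150, 999],
})

G = nx.Graph()  # undirected: CN -> SG and SG -> CN share one edge
for row in ports.itertuples(index=False):
    u, v, cap = row.country_from, row.country_to, row.capacity
    if u == v:
        continue  # assumption: legs inside one country are ignored
    if G.has_edge(u, v):
        G[u][v]["weight"] += cap  # accumulate total deployed capacity
    else:
        G.add_edge(u, v, weight=cap)

for u, v, w in G.edges(data="weight"):
    print(u, v, w)
```

Summing the capacity of both directions onto one edge is what makes the edge weight a total for the route between the two countries.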

Summary statistics of network measures and properties in 2018

Clustering and transitivity measure the tendency of nodes to cluster together, or of edges to form triangles. In our context, they measure the extent to which the countries connected to a given country tend to be connected to each other as well. The difference is that transitivity gives more weight to nodes with a large degree. The local clustering coefficient of node i is the number of triangles that pass through node i divided by the number of pairs of edges attached to node i (the triples centered on i); the network-level figure is the average of these values over all nodes. The transitivity coefficient is 3 multiplied by the number of triangles in the network, divided by the number of connected triples of nodes in the network. Both measures matter in network analysis because they show how nodes tend to form tightly knit groups with relatively dense ties.
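To make the two definitions concrete, here is a small sketch on a toy graph (one triangle plus one pendant node). The country codes and edges are invented for illustration; only the networkx functions clustering, average_clustering and transitivity are real.

```python
import networkx as nx

# Toy graph: triangle CN-SG-NL plus a pendant node BR attached to NL.
G = nx.Graph([("CN", "SG"), ("SG", "NL"), ("CN", "NL"), ("NL", "BR")])

# Local clustering: triangles through i / pairs of neighbours of i.
local = nx.clustering(G)

# Average of the local values over all nodes.
avg_clustering = nx.average_clustering(G)

# Transitivity: 3 * triangles / connected triples, computed globally,
# so high-degree nodes (which centre many triples) count for more.
transitivity = nx.transitivity(G)

print(local)
print(avg_clustering, transitivity)
```

Here NL has three neighbours but only one of the three neighbour pairs is linked, so its local coefficient is 1/3, while the graph has one triangle and five connected triples, giving a transitivity of 3/5.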

After that, we investigate some summary statistics related to distance, that is, how far one node is from another. The diameter is the largest shortest-path distance between any pair of nodes, while the average distance is the mean shortest-path distance over all pairs of nodes in the network.
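A minimal sketch of both distance statistics on an invented toy graph follows. Both networkx functions require a connected graph, so the example restricts itself to the largest connected component; whether the notebook does the same is an assumption.

```python
import networkx as nx

# Toy graph (made-up country codes): a triangle plus a pendant node.
G = nx.Graph([("CN", "SG"), ("SG", "NL"), ("CN", "NL"), ("NL", "BR")])

# Distances are only defined between mutually reachable nodes,
# so work on the largest connected component.
largest = max(nx.connected_components(G), key=len)
H = G.subgraph(largest)

# Distances are counted in hops (edge weights are ignored here).
diameter = nx.diameter(H)
average_distance = nx.average_shortest_path_length(H)

print(diameter, average_distance)
```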

Next, we focus on network centrality, which captures the importance of a node's position in the network under three assumptions: degree assumes that an important node has many connections, closeness assumes that important nodes are close to other nodes, and betweenness assumes that important nodes are well situated and connect other nodes. For this, we use the functions degree_centrality, closeness_centrality and betweenness_centrality, each of which returns a dictionary mapping each node to its centrality score. In particular, we capture the node with the best score for each measure.
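As a rough sketch of this step, the three networkx functions can be called on a toy graph and the top-scoring node extracted from each result. The graph and country codes are invented for illustration.

```python
import networkx as nx

# Toy graph (made-up country codes): a triangle plus a pendant node.
G = nx.Graph([("CN", "SG"), ("SG", "NL"), ("CN", "NL"), ("NL", "BR")])

# Each function returns a dict keyed by node.
scores = {
    "degree": nx.degree_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
}

# Node with the highest score for each measure.
leaders = {name: max(values, key=values.get) for name, values in scores.items()}

for name, node in leaders.items():
    print(name, node, round(scores[name][node], 3))
```

In this toy graph NL leads on all three measures because every path to BR passes through it.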

Now we can see what the graph looks like. For that, we use nx.drawing.layout, which provides node positioning algorithms for graph drawing. Specifically, we use spring_layout, a force-directed method whose purpose is to position the nodes in two-dimensional space so that all edges are of roughly equal length and cross as little as possible. It achieves this by assigning forces among the edges and nodes based on their relative positions and then using those forces to simulate their motion. One parameter we can adjust is k, the optimal distance between nodes; as we increase its value, the nodes move farther apart. Once we have the positions, we also create a list of node colors so that the two nodes with the highest centrality that we found are drawn in different colors to highlight them.
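The layout and highlighting steps might look like the sketch below. The toy graph, the values of k and seed, the colors, and the helper name draw are all assumptions; the plotting helper is defined but not called, and it needs matplotlib.

```python
import networkx as nx

# Toy graph with made-up country codes.
G = nx.Graph([
    ("CN", "SG"), ("SG", "NL"), ("CN", "NL"),
    ("NL", "BR"), ("NL", "US"), ("CN", "BR"),
])

# Force-directed positions; a larger k pushes nodes farther apart.
# The seed only makes the layout reproducible.
pos = nx.spring_layout(G, k=0.8, seed=7)

# Pick the two nodes with the highest degree centrality.
centrality = nx.degree_centrality(G)
top_two = sorted(centrality, key=centrality.get, reverse=True)[:2]

# One color per node, in the same order as G.nodes().
colors = ["red" if node in top_two else "blue" for node in G.nodes()]

def draw(graph, positions, node_colors):
    """Hypothetical helper: draw the graph with highlighted nodes."""
    import matplotlib.pyplot as plt  # imported lazily, only for plotting
    nx.draw_networkx(graph, pos=positions, node_color=node_colors, with_labels=True)
    plt.show()
```

Calling draw(G, pos, colors) would render the figure; the color list must follow the node order of G.nodes() for the highlight to land on the right nodes.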

Centralities - leading countries in 2018

Unweighted measures

Community detection

The community library was previously installed using pip install.

Documentation for the library is available here: https://python-louvain.readthedocs.io/en/latest/

Louvain algorithm for Community Detection
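A minimal sketch of Louvain community detection on a toy weighted graph is given below. Note the substitution: to keep the example free of extra dependencies it uses the louvain_communities function that ships with recent versions of networkx, not the python-louvain package linked above (whose entry point is community.best_partition). The graph, country codes and weights are invented.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Two densely connected groups joined by one weak link (toy data).
west = ["US", "CA", "MX", "BR"]
east = ["CN", "JP", "KR", "SG"]

G = nx.Graph()
G.add_edges_from(combinations(west, 2), weight=10)
G.add_edges_from(combinations(east, 2), weight=10)
G.add_edge("US", "CN", weight=1)

# Louvain greedily moves nodes between communities to raise modularity,
# then repeats on the aggregated graph. The seed fixes the node order.
communities = louvain_communities(G, weight="weight", seed=42)

# Modularity of the resulting partition.
quality = modularity(G, communities, weight="weight")

print([sorted(c) for c in communities], round(quality, 3))
```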

Girvan-Newman Algorithm
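For comparison, a short sketch of Girvan-Newman using networkx's girvan_newman generator, again on an invented toy graph. The algorithm repeatedly removes the edge with the highest betweenness; each item the generator yields is the partition obtained after one more split.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two cliques joined by a single bridge (toy data, unweighted).
west = ["US", "CA", "MX", "BR"]
east = ["CN", "JP", "KR", "SG"]

G = nx.Graph()
G.add_edges_from(combinations(west, 2))
G.add_edges_from(combinations(east, 2))
G.add_edge("US", "CN")  # the bridge carries all west-east shortest paths

# The generator is lazy: next() gives the first split into two groups.
splits = girvan_newman(G)
first_split = next(splits)

print([sorted(c) for c in first_split])
```

Calling next(splits) again would give the next, finer partition, so the caller decides how many communities to stop at.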

Maximal Cliques
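A maximal clique is a set of mutually connected nodes that cannot be extended by adding another node. The sketch below lists them with networkx's find_cliques on an invented toy graph.

```python
import networkx as nx

# Toy graph (made-up country codes): a triangle plus a pendant node.
G = nx.Graph([("CN", "SG"), ("SG", "NL"), ("CN", "NL"), ("NL", "BR")])

# find_cliques yields each maximal clique as a list of nodes.
cliques = [sorted(c) for c in nx.find_cliques(G)]

# The largest maximal clique is the biggest fully connected group.
largest_clique = max(cliques, key=len)

print(sorted(cliques), largest_clique)
```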


Dynamics

Network measures as time series
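One way to obtain measures as time series is to build one graph per year and collect the statistics in a dataframe indexed by year. The sketch below assumes hypothetical column names (year, country_from, country_to, capacity) and toy records in which each country pair appears at most once per year.

```python
import pandas as pd
import networkx as nx

# Hypothetical country-to-country records, one row per pair and year.
records = pd.DataFrame({
    "year":         [2017, 2017, 2018, 2018, 2018],
    "country_from": ["CN", "SG", "CN", "SG", "CN"],
    "country_to":   ["SG", "NL", "SG", "NL", "NL"],
    "capacity":     [1000, 150, 1200, 180, 300],
})

rows = []
for year, group in records.groupby("year"):
    # Undirected graph for this year; capacity is kept as an edge attribute.
    G = nx.from_pandas_edgelist(
        group, source="country_from", target="country_to", edge_attr="capacity"
    )
    rows.append({
        "year": year,
        "nodes": G.number_of_nodes(),
        "edges": G.number_of_edges(),
        "density": nx.density(G),
        "transitivity": nx.transitivity(G),
    })

# One row per year, ready for plotting with measures.plot().
measures = pd.DataFrame(rows).set_index("year")
print(measures)
```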

Key actors' stability for over 14 years of data
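One possible way to quantify stability, offered only as an illustration and not as the notebook's method, is to count in how many years each country ranks among the top k by a centrality measure. The yearly edge lists, the choice of degree centrality and the value of TOP_K are assumptions.

```python
from collections import Counter

import networkx as nx

# Hypothetical yearly edge lists (made-up country codes).
yearly_edges = {
    2017: [("SG", "CN"), ("SG", "NL"), ("SG", "BR"), ("SG", "US"), ("CN", "NL"), ("CN", "BR")],
    2018: [("SG", "CN"), ("SG", "NL"), ("SG", "BR"), ("SG", "US"), ("NL", "CN"), ("NL", "BR")],
    2019: [("SG", "CN"), ("SG", "NL"), ("SG", "BR"), ("SG", "US"), ("CN", "NL"), ("CN", "BR")],
}

TOP_K = 2  # assumption: how many leaders to track per year
appearances = Counter()

for year, edges in yearly_edges.items():
    G = nx.Graph(edges)
    centrality = nx.degree_centrality(G)
    leaders = sorted(centrality, key=centrality.get, reverse=True)[:TOP_K]
    appearances.update(leaders)

# Share of years in which each country was among the leaders.
stability = {c: n / len(yearly_edges) for c, n in appearances.items()}
print(stability)
```

A share of 1.0 means the country was a leader in every year; lower values point to actors that enter or leave the top group over time.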